Overview

Dataset statistics

Number of variables: 10
Number of observations: 442
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 34.7 KiB
Average record size in memory: 80.3 B

Variable types

NUM: 9
CAT: 1

Reproduction

Analysis started: 2020-06-04 10:50:36.924683
Analysis finished: 2020-06-04 10:51:06.665355
Duration: 29.74 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Variables

age
Real number (ℝ)

Distinct count: 58
Unique (%): 13.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -3.6396225400041895e-16
Minimum: -0.107225631607358
Maximum: 0.110726675453815
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: -0.1072256316
5-th percentile: -0.0854304009
Q1: -0.03729926643
Median: 0.005383060374
Q3: 0.03807590643
95-th percentile: 0.07076875249
Maximum: 0.1107266755
Range: 0.2179523071
Interquartile range (IQR): 0.07537517286

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -1.308351267e+14
Kurtosis: -0.6712236886
Mean: -3.63962254e-16
Median Absolute Deviation (MAD): 0.03632538451
Skewness: -0.231381533
Sum: -1.608713163e-13
Variance: 0.002267573696
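
Several of the figures above are derived from others in the same block, which makes them easy to sanity-check. The sketch below recomputes the range, IQR, variance and coefficient of variation for the age column from the values printed in this report. It assumes the CV is the standard deviation divided by the mean, which matches the reported number; the enormous magnitude is then just a consequence of dividing by a mean that is zero up to floating-point error. The final check rests on a further assumption that the report does not state: that each numeric column was centered and scaled to unit sum of squares, which would explain why every numeric column shows the same standard deviation.

```python
import math

# Values copied from the report for the "age" column.
minimum = -0.107225631607358
maximum = 0.110726675453815
q1 = -0.03729926643
q3 = 0.03807590643
mean = -3.6396225400041895e-16
std = 0.04761904762
n_obs = 442

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
variance = std ** 2               # Variance = (standard deviation)^2
cv = std / mean                   # Assumed definition: CV = std / mean

# Assumption (not stated in the report): columns are centered and scaled
# so that the sum of squares is 1, giving a sample std of 1 / sqrt(n - 1).
assumed_std = 1 / math.sqrt(n_obs - 1)

print(f"range={value_range:.10g} iqr={iqr:.10g}")
print(f"variance={variance:.10g} cv={cv:.10g}")
print(f"assumed std={assumed_std:.10g} reported std={std:.10g}")
```

Because the mean is effectively zero, the CV carries no useful information for these columns and is best ignored.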

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
0.01628067573       19     4.3%
0.04170844488       17     3.8%
0.009015598825      16     3.6%
-0.02730978568      15     3.4%
0.04534098334       14     3.2%
0.01264813728       14     3.2%
-0.05273755484      14     3.2%
-0.001882016528     14     3.2%
0.005383060374      13     2.9%
0.06713621404       13     2.9%
Other values (48)   293    66.3%

Minimum 5 values

Value               Count  Frequency (%)
-0.1072256316       3      0.7%
-0.1035930932       3      0.7%
-0.09996055471      2      0.5%
-0.09632801625      4      0.9%
-0.0926954778       4      0.9%

Maximum 5 values

Value               Count  Frequency (%)
0.1107266755        2      0.5%
0.09619652165       2      0.5%
0.0925639832        1      0.2%
0.08893144475       1      0.2%
0.0852989063        1      0.2%

sex
Categorical

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 KiB

Value               Count  Frequency (%)
-0.04464163651      235    53.2%
0.05068011874       207    46.8%

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

bmi
Real number (ℝ)

Distinct count: 163
Unique (%): 36.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -8.013951493363262e-16
Minimum: -0.0902752958985185
Maximum: 0.17055522598066
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: -0.0902752959
5-th percentile: -0.06656343027
Q1: -0.03422906806
Median: -0.00728376621
Q3: 0.03124801543
95-th percentile: 0.08540807214
Maximum: 0.170555226
Range: 0.2608305219
Interquartile range (IQR): 0.06547708349

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -5.942018449e+13
Kurtosis: 0.09509447428
Mean: -8.013951493e-16
Median Absolute Deviation (MAD): 0.03125655014
Skewness: 0.5981484879
Sum: -3.54216656e-13
Variance: 0.002267573696

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
-0.02452875939      8      1.8%
-0.03099563184      8      1.8%
-0.04608500087      7      1.6%
-0.008361578284     7      1.6%
-0.02560657147      7      1.6%
0.01427247527       6      1.4%
-0.03315125598      6      1.4%
-0.02345094732      6      1.4%
0.001338730381      6      1.4%
-0.0202175111       6      1.4%
Other values (153)  375    84.8%

Minimum 5 values

Value               Count  Frequency (%)
-0.0902752959       1      0.2%
-0.08919748382      1      0.2%
-0.08488623553      1      0.2%
-0.08380842346      1      0.2%
-0.08165279931      2      0.5%

Maximum 5 values

Value               Count  Frequency (%)
0.170555226         1      0.2%
0.1608549173        1      0.2%
0.1371430517        1      0.2%
0.1285205551        1      0.2%
0.127442743         1      0.2%

bp
Real number (ℝ)

Distinct count: 100
Unique (%): 22.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2898179256674614e-16
Minimum: -0.112399602060758
Maximum: 0.132044217194516
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: -0.1123996021
5-th percentile: -0.07435588089
Q1: -0.0366564468
Median: -0.005670610555
Q3: 0.03564383777
95-th percentile: 0.08367188395
Maximum: 0.1320442172
Range: 0.2444438193
Interquartile range (IQR): 0.07230028457

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): 3.691920129e+14
Kurtosis: -0.5327797228
Mean: 1.289817926e-16
Median Absolute Deviation (MAD): 0.03442870694
Skewness: 0.2906638512
Sum: 5.700995231e-14
Variance: 0.002267573696

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
-0.005670610555     21     4.8%
-0.04009931749      21     4.8%
-0.02632783472      20     4.5%
0.02187235499       15     3.4%
-0.0332135761       14     3.2%
-0.02288496402      13     2.9%
-0.01255635194      11     2.5%
0.04941532054       11     2.5%
-0.01599922264      11     2.5%
0.00810087222       11     2.5%
Other values (90)   294    66.5%

Minimum 5 values

Value               Count  Frequency (%)
-0.1123996021       1      0.2%
-0.1089567314       1      0.2%
-0.10207099         1      0.2%
-0.1009233664       1      0.2%
-0.09862811929      1      0.2%

Maximum 5 values

Value               Count  Frequency (%)
0.1320442172        1      0.2%
0.1251584758        1      0.2%
0.1079441223        3      0.7%
0.1045012516        2      0.5%
0.101058381         1      0.2%

s1
Real number (ℝ)

Distinct count: 141
Unique (%): 31.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -9.042540472060098e-17
Minimum: -0.126780669916514
Maximum: 0.153913713156516
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: -0.1267806699
5-th percentile: -0.07311850845
Q1: -0.0342478402
Median: -0.004320865537
Q3: 0.02835801485
95-th percentile: 0.08367131975
Maximum: 0.1539137132
Range: 0.2806943831
Interquartile range (IQR): 0.06260585505

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -5.26611385e+14
Kurtosis: 0.2329479047
Mean: -9.042540472e-17
Median Absolute Deviation (MAD): 0.03095893931
Skewness: 0.3781082069
Sum: -3.996802889e-14
Variance: 0.002267573696

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
-0.007072771253     10     2.3%
-0.03734373413      10     2.3%
0.02044628591       9      2.0%
0.01219056876       9      2.0%
0.001182945896      8      1.8%
-0.002944912678     8      1.8%
-0.02496015841      8      1.8%
-0.004320865537     8      1.8%
0.02457414449       8      1.8%
-0.005696818395     7      1.6%
Other values (131)  357    80.8%

Minimum 5 values

Value               Count  Frequency (%)
-0.1267806699       1      0.2%
-0.1088932828       1      0.2%
-0.1047654242       1      0.2%
-0.1033894713       1      0.2%
-0.1006375656       1      0.2%

Maximum 5 values

Value               Count  Frequency (%)
0.1539137132        1      0.2%
0.1525377603        1      0.2%
0.1332744203        1      0.2%
0.1277706089        2      0.5%
0.126394656         1      0.2%

s2
Real number (ℝ)

Distinct count: 302
Unique (%): 68.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3011211012575365e-16
Minimum: -0.115613065979398
Maximum: 0.198787989657293
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: -0.115613066
5-th percentile: -0.07271172671
Q1: -0.03035839726
Median: -0.003819065121
Q3: 0.02984439452
95-th percentile: 0.07946276829
Maximum: 0.1987879897
Range: 0.3144010556
Interquartile range (IQR): 0.06020279178

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): 3.659847463e+14
Kurtosis: 0.6013811504
Mean: 1.301121101e-16
Median Absolute Deviation (MAD): 0.0299056781
Skewness: 0.4365918037
Sum: 5.750955268e-14
Variance: 0.002267573696

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
0.01622243643       5      1.1%
-0.001000728964     5      1.1%
-0.02480001206      4      0.9%
-0.04703355285      4      0.9%
-0.0138398159       4      0.9%
0.056618588         4      0.9%
-0.003819065121     3      0.7%
-0.02323426975      3      0.7%
-0.01571870667      3      0.7%
0.006201685657      3      0.7%
Other values (292)  404    91.4%

Minimum 5 values

Value               Count  Frequency (%)
-0.115613066        1      0.2%
-0.1127947298       1      0.2%
-0.106844909        1      0.2%
-0.1043397214       1      0.2%
-0.1008950883       1      0.2%

Maximum 5 values

Value               Count  Frequency (%)
0.1987879897        1      0.2%
0.1558866504        1      0.2%
0.1314610704        1      0.2%
0.1302084765        1      0.2%
0.1280164373        1      0.2%

s3
Real number (ℝ)

Distinct count: 63
Unique (%): 14.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -4.563971121592555e-16
Minimum: -0.10230705051742
Maximum: 0.181179060397284
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: -0.1023070505
5-th percentile: -0.06549067248
Q1: -0.03511716059
Median: -0.006584467611
Q3: 0.02931150098
95-th percentile: 0.07790911999
Maximum: 0.1811790604
Range: 0.2834861109
Interquartile range (IQR): 0.06442866157

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -1.043368732e+14
Kurtosis: 0.9815074614
Mean: -4.563971122e-16
Median Absolute Deviation (MAD): 0.03129392133
Skewness: 0.7992551183
Sum: -2.017275236e-13
Variance: 0.002267573696

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
-0.01394774322      22     5.0%
-0.04340084565      19     4.3%
-0.03971920785      18     4.1%
-0.002902829807     15     3.4%
-0.03235593224      15     3.4%
0.008142083605      15     3.4%
-0.02131101883      15     3.4%
-0.02867429444      15     3.4%
-0.006584467611     14     3.2%
0.01550535921       14     3.2%
Other values (53)   280    63.3%

Minimum 5 values

Value               Count  Frequency (%)
-0.1023070505       1      0.2%
-0.09862541271      1      0.2%
-0.09126213711      1      0.2%
-0.08021722369      2      0.5%
-0.07653558589      5      1.1%

Maximum 5 values

Value               Count  Frequency (%)
0.1811790604        1      0.2%
0.1774974226        1      0.2%
0.1738157848        1      0.2%
0.1590892336        1      0.2%
0.151725958         1      0.2%

s4
Real number (ℝ)

Distinct count: 66
Unique (%): 14.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8631742350078977e-16
Minimum: -0.076394503750001
Maximum: 0.185234443260194
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: -0.07639450375
5-th percentile: -0.07639450375
Q1: -0.03949338287
Median: -0.002592261998
Q3: 0.03430885888
95-th percentile: 0.08076737006
Maximum: 0.1852344433
Range: 0.261628947
Interquartile range (IQR): 0.07380224175

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): 1.232640433e+14
Kurtosis: 0.4444016718
Mean: 3.863174235e-16
Median Absolute Deviation (MAD): 0.03690112088
Skewness: 0.7353736479
Sum: 1.707523012e-13
Variance: 0.002267573696

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
-0.03949338287      128    29.0%
-0.002592261998     108    24.4%
0.03430885888       68     15.4%
0.07120997975       33     7.5%
-0.07639450375      28     6.3%
0.1081111006        13     2.9%
0.1450122215        2      0.5%
-0.02141183364      2      0.5%
-0.03764832683      2      0.5%
0.01585829844       2      0.5%
Other values (56)   56     12.7%

Minimum 5 values

Value               Count  Frequency (%)
-0.07639450375      28     6.3%
-0.07085933562      1      0.2%
-0.06938329078      1      0.2%
-0.05351580881      1      0.2%
-0.05167075276      1      0.2%

Maximum 5 values

Value               Count  Frequency (%)
0.1852344433        1      0.2%
0.1553445354        1      0.2%
0.1450122215        2      0.5%
0.1413221094        1      0.2%
0.1302517732        1      0.2%

s5
Real number (ℝ)

Distinct count: 184
Unique (%): 41.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -3.848103334221131e-16
Minimum: -0.126097385560409
Maximum: 0.133598980013008
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: -0.1260973856
5-th percentile: -0.0721284546
Q1: -0.03324878725
Median: -0.001947634157
Q3: 0.03243322578
95-th percentile: 0.07904666678
Maximum: 0.13359898
Range: 0.2596963656
Interquartile range (IQR): 0.06568201303

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -1.237468007e+14
Kurtosis: -0.1343658334
Mean: -3.848103334e-16
Median Absolute Deviation (MAD): 0.03314062486
Skewness: 0.2917738324
Sum: -1.700861674e-13
Variance: 0.002267573696

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
-0.01811826731      11     2.5%
-0.03075120986      10     2.3%
-0.04118038519      8      1.8%
-0.02595242444      7      1.6%
-0.05140053526      7      1.6%
-0.03324878725      7      1.6%
-0.02364455757      6      1.4%
-0.01090443585      6      1.4%
-0.06117659509      6      1.4%
0.01556684454       6      1.4%
Other values (174)  368    83.3%

Minimum 5 values

Value               Count  Frequency (%)
-0.1260973856       1      0.2%
-0.1043648208       1      0.2%
-0.1016435479       1      0.2%
-0.09643322289      4      0.9%
-0.09393564551      1      0.2%

Maximum 5 values

Value               Count  Frequency (%)
0.13359898          2      0.5%
0.1333957338        1      0.2%
0.1323726493        1      0.2%
0.1300806095        1      0.2%
0.1290194116        1      0.2%

s6
Real number (ℝ)

Distinct count: 56
Unique (%): 12.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -3.39848812741592e-16
Minimum: -0.137767225690012
Maximum: 0.135611830689079
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: -0.1377672257
5-th percentile: -0.07563562197
Q1: -0.03317902609
Median: -0.0010776975
Q3: 0.0279170509
95-th percentile: 0.0817644408
Maximum: 0.1356118307
Range: 0.2733790564
Interquartile range (IQR): 0.06109607699

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -1.401183286e+14
Kurtosis: 0.2369167379
Mean: -3.398488127e-16
Median Absolute Deviation (MAD): 0.0289947484
Skewness: 0.2079166162
Sum: -1.502131752e-13
Variance: 0.002267573696

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
0.003064409414      22     5.0%
0.01963283707       20     4.5%
0.007206516329      20     4.5%
-0.0010776975       19     4.3%
-0.01764612516      16     3.6%
-0.01350401824      16     3.6%
-0.03835665973      15     3.4%
-0.00936191133      14     3.2%
-0.005219804415     14     3.2%
0.01549073016       14     3.2%
Other values (46)   272    61.5%

Minimum 5 values

Value               Count  Frequency (%)
-0.1377672257       1      0.2%
-0.1294830119       2      0.5%
-0.1046303704       2      0.5%
-0.09634615654      2      0.5%
-0.09220404963      4      0.9%

Maximum 5 values

Value               Count  Frequency (%)
0.1356118307        3      0.7%
0.1314697238        2      0.5%
0.1273276169        1      0.2%
0.119043403         2      0.5%
0.1066170823        4      0.9%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exact linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
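
As an illustration of the calculation just described, here is a minimal pure-Python sketch. The function name pearson_r and the sample data are made up for this example and are not part of the report or of pandas-profiling. The 1/(n-1) factors of the covariance and the standard deviations cancel, so the sketch works with plain sums of deviations.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of (co-)deviations; the common 1/(n-1) factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # A constant column has zero variance and makes r undefined
    # (this sketch then raises ZeroDivisionError).
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical sample data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)

# Shifting and rescaling y (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r(x, [10 * v - 3 for v in y])
print(r, r_rescaled)
```

The second call demonstrates the invariance under changes of location and scale mentioned above.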

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
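
The rank-based calculation can be sketched in a few lines of pure Python. The helper names average_ranks and spearman_rho are hypothetical, and assigning tied values the average of their ranks is an assumption of this sketch (it is the usual convention, but the report does not say how ties are handled).

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical data: y is a nonlinear but strictly increasing function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because y increases strictly with x, the ranks of the two variables coincide and ρ reaches its maximum even though the relationship is not linear.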

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
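
A direct, quadratic-time sketch of the pair-counting formula above is shown next. The function name kendall_tau_a and the sample data are hypothetical. This is the plain variant without any tie correction (often called tau-a); statistical libraries commonly report a tie-corrected variant instead, so results can differ on data with repeated values such as the columns in this report.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair oppositely
        # sign == 0 is a tie in x or y; it counts toward neither total.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: only the pair (2, 3) vs (3, 2) is discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs, five of them concordant and one discordant in this example.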

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Missing values

Sample

First rows

        age       sex       bmi        bp        s1        s2        s3        s4        s5        s6
0  0.038076  0.050680  0.061696  0.021872 -0.044223 -0.034821 -0.043401 -0.002592  0.019908 -0.017646
1 -0.001882 -0.044642 -0.051474 -0.026328 -0.008449 -0.019163  0.074412 -0.039493 -0.068330 -0.092204
2  0.085299  0.050680  0.044451 -0.005671 -0.045599 -0.034194 -0.032356 -0.002592  0.002864 -0.025930
3 -0.089063 -0.044642 -0.011595 -0.036656  0.012191  0.024991 -0.036038  0.034309  0.022692 -0.009362
4  0.005383 -0.044642 -0.036385  0.021872  0.003935  0.015596  0.008142 -0.002592 -0.031991 -0.046641
5 -0.092695 -0.044642 -0.040696 -0.019442 -0.068991 -0.079288  0.041277 -0.076395 -0.041180 -0.096346
6 -0.045472  0.050680 -0.047163 -0.015999 -0.040096 -0.024800  0.000779 -0.039493 -0.062913 -0.038357
7  0.063504  0.050680 -0.001895  0.066630  0.090620  0.108914  0.022869  0.017703 -0.035817  0.003064
8  0.041708  0.050680  0.061696 -0.040099 -0.013953  0.006202 -0.028674 -0.002592 -0.014956  0.011349
9 -0.070900 -0.044642  0.039062 -0.033214 -0.012577 -0.034508 -0.024993 -0.002592  0.067736 -0.013504

Last rows

          age       sex       bmi        bp        s1        s2        s3        s4        s5        s6
432  0.009016 -0.044642  0.055229 -0.005671  0.057597  0.044719 -0.002903  0.023239  0.055684  0.106617
433 -0.027310 -0.044642 -0.060097 -0.029771  0.046589  0.019980  0.122273 -0.039493 -0.051401 -0.009362
434  0.016281 -0.044642  0.001339  0.008101  0.005311  0.010899  0.030232 -0.039493 -0.045421  0.032059
435 -0.012780 -0.044642 -0.023451 -0.040099 -0.016704  0.004636 -0.017629 -0.002592 -0.038459 -0.038357
436 -0.056370 -0.044642 -0.074108 -0.050428 -0.024960 -0.047034  0.092820 -0.076395 -0.061177 -0.046641
437  0.041708  0.050680  0.019662  0.059744 -0.005697 -0.002566 -0.028674 -0.002592  0.031193  0.007207
438 -0.005515  0.050680 -0.015906 -0.067642  0.049341  0.079165 -0.028674  0.034309 -0.018118  0.044485
439  0.041708  0.050680 -0.015906  0.017282 -0.037344 -0.013840 -0.024993 -0.011080 -0.046879  0.015491
440 -0.045472 -0.044642  0.039062  0.001215  0.016318  0.015283 -0.028674  0.026560  0.044528 -0.025930
441 -0.045472 -0.044642 -0.073030 -0.081414  0.083740  0.027809  0.173816 -0.039493 -0.004220  0.003064